generating function
Sharp Concentration Inequalities: Phase Transition and Mixing of Orlicz Tails with Variance
In this work, we investigate how to develop sharp concentration inequalities for sub-Weibull random variables, including sub-Gaussian and sub-exponential distributions. Although the random variables may not be sub-Guassian, the tail probability around the origin behaves as if they were sub-Gaussian, and the tail probability decays align with the Orlicz $Ψ_α$-tail elsewhere. Specifically, for independent and identically distributed (i.i.d.) $\{X_i\}_{i=1}^n$ with finite Orlicz norm $\|X\|_{Ψ_α}$, our theory unveils that there is an interesting phase transition at $α= 2$ in that $\PPł(ł|\sum_{i=1}^n X_i \r| \geq t\r)$ with $t > 0$ is upper bounded by $2\expł(-C\maxł\{\frac{t^2}{n\|X\|_{Ψ_α}^2},\frac{t^α}{ n^{α-1} \|X\|_{Ψ_α}^α}\r\}\r)$ for $α\geq 2$, and by $2\expł(-C\minł\{\frac{t^2}{n\|X\|_{Ψ_α}^2},\frac{t^α}{ n^{α-1} \|X\|_{Ψ_α}^α}\r\}\r)$ for $1\leq α\leq 2$ with some positive constant $C$. In many scenarios, it is often necessary to distinguish the standard deviation from the Orlicz norm when the latter can exceed the former greatly. To accommodate this, we build a new theoretical analysis framework, and our sharp, flexible concentration inequalities involve the variance and a mixing of Orlicz $Ψ_α$-tails through the min and max functions. Our theory yields new, improved concentration inequalities even for the cases of sub-Gaussian and sub-exponential distributions with $α= 2$ and $1$, respectively. We further demonstrate our theory on martingales, random vectors, random matrices, and covariance matrix estimation. These sharp concentration inequalities can empower more precise non-asymptotic analyses across different statistical and machine learning applications.
- North America > United States > California > Los Angeles County > Los Angeles (0.28)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Probabilistic Inference with Generating Functions for Poisson Latent Variable Models
Graphical models with latent count variables arise in a number of fields. Standard exact inference techniques such as variable elimination and belief propagation do not apply to these models because the latent variables have countably infinite support. As a result, approximations such as truncation or MCMC are employed. We present the first exact inference algorithms for a class of models with latent count variables by developing a novel representation of countably infinite factors as probability generating functions, and then performing variable elimination with generating functions. Our approach is exact, runs in pseudo-polynomial time, and is much faster than existing approximate techniques. It leads to better parameter estimates for problems in population ecology by avoiding error introduced by approximate likelihood computations.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Asia > Middle East > Jordan (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Asia > Middle East > Jordan (0.04)
- Asia > South Korea > Seoul > Seoul (0.04)
- North America > United States (0.04)
- South America > Brazil > São Paulo > São Paulo (0.04)
- (6 more...)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
- North America > Canada (0.04)
- Asia > Singapore (0.04)
- (9 more...)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.82)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)
Consistency of Honest Decision Trees and Random Forests
Bladt, Martin, Lemvig, Rasmus Frigaard
We study various types of consistency of honest decision trees and random forests in the regression setting. In contrast to related literature, our proofs are elementary and follow the classical arguments used for smoothing methods. Under mild regularity conditions on the regression function and data distribution, we establish weak and almost sure convergence of honest trees and honest forest averages to the true regression function, and moreover we obtain uniform convergence over compact covariate domains. The framework naturally accommodates ensemble variants based on subsampling and also a two-stage bootstrap sampling scheme. Our treatment synthesizes and simplifies existing analyses, in particular recovering several results as special cases. The elementary nature of the arguments clarifies the close relationship between data-adaptive partitioning and kernel-type methods, providing an accessible approach to understanding the asymptotic behavior of tree-based methods.
On detection probabilities of link invariants
Kelomäki, Tuomas, Lacabanne, Abel, Tubbenhauer, Daniel, Vaz, Pedro, Zhang, Victor L.
We prove that the detection rate of n-crossing alternating links by many standard link invariants decays exponentially in n, implying that they detect alternating links with probability zero. This phenomenon applies broadly, in particular to the Jones and HOMFLYPT polynomials and integral Khovanov homology. We also use a big-data approach to analyze knots and provide evidence that, for knots as well, these invariants exhibit the same asymptotic failure of detection.
- Europe (0.93)
- North America > United States (0.28)
Probabilistic Inference with Generating Functions for Poisson Latent Variable Models
Graphical models with latent count variables arise in a number of fields. Standard exact inference techniques such as variable elimination and belief propagation do not apply to these models because the latent variables have countably infinite support. As a result, approximations such as truncation or MCMC are employed. We present the first exact inference algorithms for a class of models with latent count variables by developing a novel representation of countably infinite factors as probability generating functions, and then performing variable elimination with generating functions. Our approach is exact, runs in pseudo-polynomial time, and is much faster than existing approximate techniques. It leads to better parameter estimates for problems in population ecology by avoiding error introduced by approximate likelihood computations.